Feature-Rich Twitter Named Entity Recognition and Classification
Abstract
Twitter named entity recognition is the process of identifying proper names in tweets and classifying them into predefined categories. The paper introduces a Twitter named entity system based on a supervised machine learning approach, Conditional Random Fields. A large set of different features was developed and used to train the system. The Twitter named entity task can be divided into two parts: i) named entity extraction from tweets and ii) classification of the extracted names into ten different types. For Twitter named entity recognition on unseen test data, our system obtained the second highest F1 score in the shared task: 63.22%. Performance on the classification task was lower, with an F1 measure of 40.06% on unseen test data, the fourth best of the ten systems participating in the shared task.
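The two-part task and the F1 scoring described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature names in `token_features`, the helper names, and the example labels (`person`, `geo-loc`, `company`) are assumptions chosen for illustration, and BIO labelling is assumed as the tagging scheme. Per-token feature dictionaries of this general shape are what a Conditional Random Fields toolkit would typically consume.

```python
import re


def token_features(tokens, i):
    """Return a feature dict for tokens[i] (hypothetical feature set)."""
    tok = tokens[i]
    return {
        "lower": tok.lower(),
        "is_capitalized": tok[:1].isupper(),
        "is_all_caps": tok.isupper(),
        "has_digit": any(c.isdigit() for c in tok),
        "is_hashtag": tok.startswith("#"),
        "is_mention": tok.startswith("@"),
        "is_url": bool(re.match(r"https?://", tok)),
        "prefix3": tok[:3].lower(),
        "suffix3": tok[-3:].lower(),
        # Context features: neighbouring tokens, with boundary markers.
        "prev_lower": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next_lower": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


def strip_types(labels):
    """Collapse typed BIO labels (e.g. B-person) to untyped ones (B, I, O)."""
    return [lab.split("-", 1)[0] for lab in labels]


def bio_spans(labels):
    """Extract a set of (start, end, type) spans from BIO labels; end is exclusive."""
    spans, start, etype = set(), None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, lab in enumerate(list(labels) + ["O"]):
        tag, _, typ = lab.partition("-")
        # Close the open span unless this token continues it.
        if start is not None and not (tag == "I" and typ == etype):
            spans.add((start, i, etype))
            start, etype = None, None
        # Open a new span on B, or on an I that has nothing to continue.
        if tag == "B" or (tag == "I" and start is None):
            start, etype = i, typ
    return spans


def span_f1(gold, pred):
    """Entity-level F1: a predicted span counts only if it matches a gold span exactly."""
    g, p = bio_spans(gold), bio_spans(pred)
    tp = len(g & p)
    if tp == 0:
        return 0.0
    precision, recall = tp / len(p), tp / len(g)
    return 2 * precision * recall / (precision + recall)


# Illustrative labels for a four-token tweet (not data from the paper).
gold = ["B-person", "I-person", "O", "B-geo-loc"]
pred = ["B-person", "I-person", "O", "B-company"]

typed_f1 = span_f1(gold, pred)                              # boundaries and types
untyped_f1 = span_f1(strip_types(gold), strip_types(pred))  # boundaries only
```

In this toy example both entity boundaries are found, so the untyped score is 1.0, while one wrong type halves the typed score to 0.5. This is one way a typed score can fall below an untyped one on the same predictions.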
Related resources
Bidirectional LSTM for Named Entity Recognition in Twitter Messages
In this paper, we present our approach for named entity recognition in Twitter messages that we used in our participation in the Named Entity Recognition in Twitter shared task at the COLING 2016 Workshop on Noisy User-generated text (WNUT). The main challenge that we aim to tackle in our participation is the short, noisy and colloquial nature of tweets, which makes named entity recognition in ...
Improving Persian Named Entity Recognition Using the Kasra Ezafe
Named entity recognition is a process in which the names of people, places (cities, countries, seas, etc.) and organizations (public and private companies, international institutions, etc.), as well as dates, currencies and percentages in a text are identified. Named entity recognition plays an important role in many NLP tasks such as semantic role labeling, question answering, summarization, machine ...
Learning to Search for Recognizing Named Entities in Twitter
This paper describes our participation in the shared task Named Entity Recognition in Twitter organized as part of the 2nd Workshop on Noisy User-generated Text. The shared task comprises two sub-tasks, concerning a) the detection of the boundaries of entities and b) the classification of the entities into one of 10 possible types. The proposed approach is based on Linked Open Data for extracti...
Twitter Named Entity Extraction and Linking Using Differential Evolution
Systems that simultaneously identify and classify named entities in Twitter typically show poor recall. To remedy this, the task is here divided into two parts: i) named entity identification using Conditional Random Fields in a multi-objective framework built on Differential Evolution, and ii) named entity classification using Vector Space Modelling and edit distance techniques. Differential E...
THE JOHNS HOPKINS UNIVERSITY Nerit: Named Entity Recognition for Informal Text
We describe a multilingual named entity recognition system using language independent feature templates, designed for processing short, informal media arising from Twitter and other microblogging services. We crowdsource the annotation of tens of thousands of English and Spanish tweets and present classification results on this resource.